Relative Error Embeddings of the Gaussian Kernel Distance

Authors

  • Di Chen
  • Jeff M. Phillips
Abstract

A reproducing kernel defines an embedding of a data point into an infinite-dimensional reproducing kernel Hilbert space (RKHS). The norm in this space describes a distance, which we call the kernel distance. The random Fourier features (of Rahimi and Recht) describe an oblivious approximate mapping into finite-dimensional Euclidean space that behaves similarly to the RKHS. We show in this paper that for the Gaussian kernel the Euclidean norm between these mapped features has (1 + ε)-relative error with respect to the kernel distance. When there are n data points, we show that O((1/ε²) log n) dimensions of the approximate feature space are both sufficient and necessary. Without a bound on n, but when the original points lie in ℝ^d and have diameter bounded by M, we show that O((d/ε²) log M) dimensions are sufficient, and that this many are required, up to log(1/ε) factors. We empirically confirm that relative error is indeed preserved for kernel PCA using these approximate feature maps.
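The mapping the abstract refers to can be sketched concretely. Below is a minimal NumPy sketch of the Rahimi–Recht random Fourier feature map for the Gaussian kernel, comparing the Euclidean distance between mapped features against the exact kernel distance d_K(x, y)² = 2 − 2k(x, y). The bandwidth σ, feature count D, and test points are illustrative choices, not values from the paper:

```python
import numpy as np

# Random Fourier features (Rahimi & Recht) for the Gaussian kernel
#   k(x, y) = exp(-||x - y||^2 / (2 sigma^2)).
rng = np.random.default_rng(0)

def rff_map(X, D=4000, sigma=1.0, rng=rng):
    """Map rows of X (points in R^d) to D-dimensional random features."""
    d = X.shape[1]
    W = rng.normal(0.0, 1.0 / sigma, size=(d, D))  # frequencies ~ N(0, I / sigma^2)
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)      # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

x = np.array([[0.0, 0.0]])
y = np.array([[1.0, 0.5]])

# Exact kernel distance: d_K(x, y)^2 = k(x,x) + k(y,y) - 2 k(x,y) = 2 - 2 k(x,y).
exact = np.sqrt(2.0 - 2.0 * np.exp(-np.sum((x - y) ** 2) / 2.0))

# Map both points with the SAME random features, then take the Euclidean norm.
Z = rff_map(np.vstack([x, y]))
approx = np.linalg.norm(Z[0] - Z[1])
```

Note that both points must be mapped with the same draw of frequencies and phases; mapping them with independent draws destroys the approximation.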


Similar resources

Breast cancer detection using nonparametric probability density estimation based on kernel methods

Introduction: Breast cancer is the most common cancer in women, so an accurate and reliable system for early diagnosis of benign or malignant tumors is needed. Using FNA results together with data-mining and machine-learning techniques, we can design new methods for early diagnosis that detect breast cancer with high accuracy. Materials and Methods: In this study,...


Kernel-Based Just-In-Time Learning for Passing Expectation Propagation Messages SUPPLEMENTARY MATERIAL A MEDIAN HEURISTIC FOR GAUSSIAN KERNEL ON MEAN EMBEDDINGS

In the proposed KJIT, there are two kernels: the inner kernel k for computing mean embeddings, and the outer Gaussian kernel κ defined on the mean embeddings. Both of the kernels depend on a number of parameters. In this section, we describe a heuristic to choose the kernel parameters. We emphasize that this heuristic is merely for computational convenience. A full parameter selection procedure...
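The heuristic mentioned above is, in its common form, the median heuristic: set the Gaussian bandwidth to the median pairwise distance of the inputs. The snippet below is a generic NumPy sketch of that standard heuristic, not the exact procedure from the supplementary material (which operates on mean embeddings):

```python
import numpy as np

def median_heuristic(X):
    """Return the median pairwise Euclidean distance among rows of X."""
    sq = np.sum(X ** 2, axis=1)
    # Squared distance matrix via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    i, j = np.triu_indices(len(X), k=1)  # upper triangle, excluding the diagonal
    return float(np.sqrt(np.median(np.maximum(D2[i, j], 0.0))))
```

The returned value is then used as σ in k(x, y) = exp(−‖x − y‖² / (2σ²)); the max with zero guards against tiny negative entries from floating-point cancellation.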


Fast High-dimensional Kernel Summations Using the Monte Carlo Multipole Method

We propose a new fast Gaussian summation algorithm for high-dimensional datasets with high accuracy. First, we extend the original fast multipole-type methods to use approximation schemes with both hard and probabilistic error. Second, we utilize a new data structure called subspace tree which maps each data point in the node to its lower dimensional mapping as determined by any linear dimensio...


The Relative Improvement of Bias Reduction in Density Estimator Using Geometric Extrapolated Kernel

One of the nonparametric procedures used to estimate densities is the kernel method. In this paper, in order to reduce the bias of kernel density estimation, methods such as the usual kernel (UK), geometric extrapolation usual kernel (GEUK), a bias reduction kernel (BRK), and a geometric extrapolation bias reduction kernel (GEBRK) are introduced. Theoretical properties, including the selection of smoothness para...
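The usual kernel (UK) estimator that the variants above build on has the standard form f̂(x) = (1/(nh)) Σᵢ K((x − xᵢ)/h). A minimal sketch with a Gaussian kernel K (the bandwidth h and the sample data are illustrative, not from the paper):

```python
import numpy as np

def kde_gaussian(x, data, h):
    """Usual kernel density estimate at x: (1/(n h)) * sum_i K((x - x_i)/h),
    with K the standard normal pdf."""
    u = (x - data) / h
    return float(np.mean(np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)) / h)
```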


Wasserstein Distance Measure Machines

This paper presents a distance-based discriminative framework for learning with probability distributions. Instead of using kernel mean embeddings or generalized radial basis kernels, we introduce embeddings based on dissimilarity of distributions to some reference distributions denoted as templates. Our framework extends the theory of similarity of Balcan et al. (2008) to the population distri...



Publication date: 2017